57 research outputs found

    Context-based RNA-seq mapping

    In recent years, the sequencing of RNA (RNA-seq) using next generation sequencing (NGS) technology has become a powerful tool for analyzing the transcriptomic state of a cell. Modern NGS platforms make it possible to perform RNA-seq experiments in a few days, yielding millions of short sequencing reads. A crucial step in analyzing RNA-seq data is determining the transcriptomic origin of the sequencing reads (read mapping). In principle, read mapping is a sequence alignment problem in which the short sequencing reads (30-500 nucleotides) are aligned to much larger reference sequences such as the human genome (3 billion nucleotides). In this thesis, we present ContextMap, an RNA-seq mapping approach that evaluates the context of the sequencing reads to determine the most likely origin of every read. The context of a sequencing read is defined by all other reads aligned to the same genomic region. The ContextMap project started with a proof-of-concept study, in which we showed that our approach can improve existing read mapping results provided by other mapping programs. Subsequently, we developed a standalone version of ContextMap. This implementation no longer relied on the mapping results of other programs, but determined initial alignments itself using a modification of the Bowtie short read alignment program. However, the original ContextMap implementation had several drawbacks. In particular, it could neither predict reads spanning more than two exons nor detect insertions or deletions (indels). Furthermore, ContextMap depended on a modification of a specific Bowtie version. Thus, it could benefit neither from Bowtie updates nor from novel developments (e.g. improved running times) in the area of short read alignment software. To address these problems, we developed ContextMap 2, an extension of the original ContextMap algorithm.
The key features of ContextMap 2 are the context-based resolution of ambiguous read alignments and the accurate detection of reads crossing an arbitrary number of exon-exon junctions or containing indels. Furthermore, a plug-in interface allows the easy integration of alternative short read alignment programs (e.g. Bowtie 2 or BWA) into the mapping workflow. The performance of ContextMap 2 was evaluated on real-life as well as synthetic data and compared to other state-of-the-art mapping programs. We found that ContextMap 2 had very low rates of misplaced reads and of incorrectly predicted junctions or indels. Additionally, recall values were as high as for the top competing methods. Moreover, the runtime of ContextMap 2 was at least twofold lower than for the best competitors. In addition to mapping sequencing reads to a single reference, the ContextMap approach allows the investigation of several potential read sources (e.g. the human host and infecting pathogens) in parallel. Thus, ContextMap can be applied to mine for infections or contaminations or to map data from meta-transcriptomic studies. Furthermore, we developed methods based on mapping-derived statistics that make it possible to assess the confidence of mappings to identified species and to detect false positive hits. ContextMap was evaluated on three real-life data sets and the results were compared to metagenomics tools. Here, we showed that ContextMap can successfully identify the species contained in a sample. Moreover, in contrast to most other metagenomics approaches, ContextMap also provides read mapping results for individual species. As a consequence, read mapping results determined by ContextMap can be used to study the gene expression of all species contained in a sample at the same time. Thus, ContextMap might be applied in clinical studies that investigate the influence of infecting agents on host organisms.
The methods presented in this thesis allow for accurate and fast mapping of RNA-seq data. As the amount of available sequencing data increases constantly, these methods will likely become an important part of many RNA-seq data analyses and thus contribute valuably to research in the field of transcriptomics.

    Diabetes mellitus

    A total of 506 participants aged 13 to 88 years took part in this study: 262 men and 244 women. Of these, 254 had diabetes and 252 reported being non-diabetic. In- and outpatients with type 1 and type 2 diabetes were interviewed with a standardized questionnaire in several Austrian diabetes care centers. The non-diabetic participants formed the control group. One arm of the study was conducted over the internet using the same questionnaire. Among type 2 diabetics, the median BMI was lower in men than in women (29.37 kg/m² vs. 31.94 kg/m²). For waist circumference and waist-to-hip ratio (WHR) the relation was reversed: the median was 107.00 (WHR 1.00) in men and 105.00 (WHR 0.92) in women. In assessing their own weight status, women performed better than men.
Diabetics had better knowledge of the disease because they attend educational programs at the specialist diabetes centers that care for them. Among type 2 diabetics, a difference between men and women was found for WHR and BMI, whereas waist circumference and age showed no significant differences.

    A context-based approach to identify the most likely mapping for RNA-seq experiments

    Background: Sequencing of mRNA (RNA-seq) by next generation sequencing technologies is widely used for analyzing the transcriptomic state of a cell. Here, one of the main challenges is the mapping of a sequenced read to its transcriptomic origin. As a simple alignment to the genome will fail to identify reads crossing splice junctions and a transcriptome alignment will miss novel splice sites, several approaches have been developed for this purpose. Most of these approaches have two drawbacks. First, each read is assigned to a location independently of whether the corresponding gene is expressed or not, i.e. information from other reads is not taken into account. Second, in the case of multiple possible mappings, the mapping with the fewest mismatches is usually chosen, which may lead to wrong assignments due to sequencing errors. Results: To address these problems, we developed ContextMap, which efficiently uses information on the context of a read, i.e. reads mapping to the same expressed region. The context information is used to resolve possible ambiguities and, thus, a much larger degree of ambiguity can be allowed in the initial stage in order to detect all possible candidate positions. Although ContextMap can be used as a stand-alone version using either a genome or a transcriptome as input, the version presented in this article is focused on refining initial mappings provided by other mapping algorithms. Evaluation results on simulated sequencing reads showed that applying ContextMap to either TopHat or MapSplice mappings improved the accuracy of both initial mappings considerably. Conclusions: In this article, we show that the context of reads mapping to nearby locations provides valuable information for identifying the best unique mapping for a read. Using our method, mappings provided by other state-of-the-art methods can be refined and alignment accuracy can be further improved.
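
    The idea of using read context to choose among several candidate positions can be illustrated with a toy sketch. This is not ContextMap's actual algorithm: the function name `resolve_by_context`, the fixed-size `window` bins standing in for expressed regions, and the scoring rule (support first, mismatches only as a tie-breaker) are all assumptions made for illustration.

```python
from collections import defaultdict


def resolve_by_context(candidates, window=1000):
    """Pick one candidate alignment per read using nearby-read support.

    candidates: dict mapping read id -> list of
                (chromosome, position, mismatches) tuples.
    Returns a dict mapping read id -> the chosen tuple.
    Hypothetical helper for illustration only.
    """
    # Count candidate alignments per fixed-size bin. Each read's own
    # candidates are counted too, which adds one to every bin it touches.
    support = defaultdict(int)
    for alignments in candidates.values():
        for chrom, pos, _ in alignments:
            support[(chrom, pos // window)] += 1

    chosen = {}
    for read_id, alignments in candidates.items():
        # Prefer the bin with the most support; among equally supported
        # bins, prefer the alignment with fewer mismatches.
        chosen[read_id] = max(
            alignments,
            key=lambda a: (support[(a[0], a[1] // window)], -a[2]),
        )
    return chosen


# r3 has a perfect match on chr2, but no other read maps there,
# while two other reads support the chr1 location.
candidates = {
    "r1": [("chr1", 100, 0)],
    "r2": [("chr1", 150, 0)],
    "r3": [("chr1", 180, 1), ("chr2", 5000, 0)],
}
result = resolve_by_context(candidates)
```

    In this example a fewest-mismatches rule would place `r3` on chr2, whereas the context-based rule places it on chr1 next to `r1` and `r2`, which mirrors the second drawback described in the abstract above.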

    ContextMap 2: fast and accurate context-based RNA-seq mapping

    Background: Mapping of short sequencing reads is a crucial step in the analysis of RNA sequencing (RNA-seq) data. ContextMap is an RNA-seq mapping algorithm that uses a context-based approach to identify the best alignment for each read and allows parallel mapping against several reference genomes. Results: In this article, we present ContextMap 2, a new and improved version of ContextMap. Its key novel features are: (i) a plug-in structure that allows easily integrating novel short read alignment programs with improved accuracy and runtime; (ii) context-based identification of insertions and deletions (indels); (iii) mapping of reads spanning an arbitrary number of exons and indels. ContextMap 2 using Bowtie, Bowtie 2 or BWA was evaluated on both simulated and real-life data from the recently published RGASP study. Conclusions: We show that ContextMap 2 generally combines similar or higher recall compared to other state-of-the-art approaches with significantly higher precision in read placement and junction and indel prediction. Furthermore, runtime was significantly lower than for the best competing approaches. ContextMap 2 is freely available at http://www.bio.ifi.lmu.de/ContextMap.

    Widespread disruption of host transcription termination in HSV-1 infection.

    Herpes simplex virus 1 (HSV-1) is an important human pathogen and a paradigm for virus-induced host shut-off. Here we show that global changes in transcription and RNA processing and their impact on translation can be analysed in a single experimental setting by applying 4sU-tagging of newly transcribed RNA and ribosome profiling to lytic HSV-1 infection. Unexpectedly, we find that HSV-1 triggers the disruption of transcription termination of cellular, but not viral, genes. This results in extensive transcription for tens of thousands of nucleotides beyond poly(A) sites and into downstream genes, leading to novel intergenic splicing between exons of neighbouring cellular genes. As a consequence, hundreds of cellular genes seem to be transcriptionally induced but are not translated. In contrast to previous reports, we show that HSV-1 does not inhibit co-transcriptional splicing. Our approach thus substantially advances our understanding of HSV-1 biology and establishes HSV-1 as a model system for studying transcription termination. This work was supported by MRC Fellowship grant G1002523 and NHSBT grant WP11-05 to LD, and DFG grant FR2938/1–2 to C.C.F. We thank Viv Connor for excellent technical assistance and Professor Rozanne Sandri-Goldin (University of California) for the ΔICP27 mutant and complementing cell line. The support of the Cluster of Excellence (Nucleotide lab) to P.R. is acknowledged. This is the final version of the article. It first appeared from NPG via http://dx.doi.org/10.1038/ncomms812

    Childhood Stroke: Awareness, Interest, and Knowledge Among the Pediatric Community

    Objective: Acute childhood stroke is an emergency requiring a high level of awareness among first-line healthcare providers. This survey serves as an indicator of the awareness of, interest in, and knowledge of childhood stroke among German pediatricians. Methods: A total of 1,697 physicians at pediatric in- and outpatient facilities in Bavaria, Germany, were invited via email to an online survey about childhood stroke. Results: The overall participation rate was 14%. Forty-six percent of participants had considered a diagnosis of childhood stroke at least once during the past year, and 47% provide care for patients who have suffered childhood stroke. The acronym FAST (Face-Arm-Speech-Time-Test) was correctly cited in 27% of the questionnaires. The most commonly quoted symptoms of childhood stroke were hemiparesis (90%), speech disorder (58%), seizure (44%), headache (40%), and impaired consciousness (33%). Migraine (63%), seizure (39%), and infections of the brain (31%) were most frequently named as stroke mimics. The main diagnostic measures indicated were magnetic resonance imaging (MRI) (96%) and computed tomography (CT) (55%). The main therapeutic strategies were thrombolysis (80%), anticoagulation (41%), neuroprotective measures, and thrombectomies (15% each). Thirty-nine percent of participants had taken part in training sessions, 61% had studied literature, 37% had discussed the topic with colleagues, and 25% had performed internet research on childhood stroke. Ninety-three percent of participants approved of skill enhancement, favoring training sessions (80%), publications (43%), and web-based offers (35%). Forty-nine percent agreed to offer a flyer on the topic to caregivers in their facilities. Conclusion: Childhood stroke is a topic of clinical importance to pediatricians. Participants demonstrated a considerable level of comprehension of the subject, but room for improvement remains.
A multi-modal approach encompassing an elaborate training program, regular educational publications in professional journals, and web-based offers could reach a broad range of health care providers. Paired with a public adult and childhood stroke awareness campaign, these efforts could help optimize the care of children suffering from stroke.

    Die Pfeilschussverletzung des Ötzi - War sie primäre Todesursache? [Ötzi's arrow wound: was it the primary cause of death?]

    The exact circumstances of the Iceman's death remain a mystery and will probably never be resolved in detail. Based on radiological data (CT volume datasets), it was examined whether a fatal injury to the Iceman by the arrowhead was possible. Three CT volume datasets from contrast-enhanced examinations of comparison individuals were registered to the CT data of the Iceman, acquired in Innsbruck in 1994, using landmark-based B-spline registration. In this way it was tested whether the large neurovascular structures of the left upper extremity in the comparison datasets lie near the arrowhead projected into these deformed datasets. The penetration depth of the arrowhead was measured, and a virtual repositioning of the Iceman's left shoulder was carried out to obtain further information about the direction from which the arrow was shot.

    Prediction of Poly(A) Sites by Poly(A) Read Mapping.

    RNA-seq reads containing part of the poly(A) tail of transcripts (denoted as poly(A) reads) provide the most direct evidence for the position of poly(A) sites in the genome. However, due to reduced coverage of poly(A) tails by reads, poly(A) reads are not routinely identified during RNA-seq mapping. Nevertheless, recent studies for several herpesviruses successfully employed mapping of poly(A) reads to identify herpesvirus poly(A) sites using different strategies and customized programs. To make such analyses easier and avoid the need for additional programs, we integrated poly(A) read mapping and prediction of poly(A) sites into our RNA-seq mapping program ContextMap 2. The implemented approach essentially generalizes previously used poly(A) read mapping approaches and combines them with the context-based approach of ContextMap 2 to take into account information provided by other reads aligned to the same location. Poly(A) read mapping using ContextMap 2 was evaluated on real-life data from the ENCODE project and compared against a competing approach based on transcriptome assembly (KLEAT). This showed a high positive predictive value for our approach, evidenced also by the presence of poly(A) signals, and considerably lower runtime than KLEAT. Although sensitivity is low for both methods, we show that this is in part due to a high extent of spurious results in the gold standard set derived from RNA-PET data. Sensitivity improves for poly(A) sites of known transcripts or sites determined with a more specific poly(A) sequencing protocol, and increases with read coverage on transcript ends. Finally, we illustrate the usefulness of the approach in a high read coverage scenario by a re-analysis of published data for herpes simplex virus 1. Thus, with current trends towards increasing sequencing depth and read length, poly(A) read mapping will prove to be increasingly useful and can now be performed automatically during RNA-seq mapping with ContextMap 2.